Method and apparatus for transcoding a digitally compressed high definition television
bitstream to a standard definition television bitstream.



The invention relates to techniques for manipulating digitally compressed 
and coded data (for example video information) in order to convert it from one format (or 
specification) to another. Such a method will be referred to herein as digital transcoding, 
and a device with such a functionality will be referred to as a digital transcoder. 

The digital video compression standard developed by the International Organization for
Standardization's (ISO) Moving Picture Experts Group (MPEG) is becoming a
key technology in the delivery of digital video programs over a wide variety of media such
as terrestrial broadcasting, telecommunication, and cable. It is now almost certain that a
digital HDTV (high definition television) standard similar to or compatible with standards
recommended by MPEG will be used for terrestrial HDTV transmission in North America
and Europe. Similar technology will also be used to provide HDTV and standard definition
television (SDTV) over cable, phone, fiber optic, satellite and ISDN networks.



U.S. Patent No. 5,243,428 discusses the MPEG standard as well as its
block and frame coding protocols. This patent is incorporated by reference herein. Further
details about inter/intra frame and MPEG-like video coding can also be found in the
following references, which are also incorporated by reference herein:

MPEG: A Video Compression Standard for Multimedia Applications, Le
Gall, Communications of the ACM, Vol. 34, No. 4, April 1991;

Advanced Digital Communications, Feher, Prentice-Hall Inc., Englewood
Cliffs, N.J. (1987);

The Digital Simulcast AD-HDTV Coding System, IEEE Transactions on
Consumer Electronics, Vol. 38, No. 4, November 1992;

Information Technology - Generic Coding of Moving Pictures and
Associated Audio, ISO/IEC committee draft, November 1993; and

Test Model 5, Draft, Test Model Editing Committee, ISO/IEC, April 1993.
SDTV is defined herein as a digitally encoded television signal which can
deliver a television picture comparable in overall format and resolution to conventional (e.g.
NTSC or PAL) type television pictures. Using techniques for coding HDTV source signals
into conventional television channel bandwidths (e.g. 6 MHz), several SDTV programs can
be provided on each channel instead of a single HDTV program.

Introduction of HDTV will probably begin before receivers capable of
reproducing the full HDTV source signal are either available or affordable for most viewers.
There will therefore be a need to convert HDTV signals to SDTV signals (i.e. transcode
them) so that they can be further processed for display on conventional (e.g. NTSC)
television receivers which will only be able to decode and display standard definition video.

A digital transcoder may be located at an intermediate stage in the
transmitting chain or as part of a telecommunication network, such as at a head-end or at a
network switch. As currently envisioned, a transcoder will receive the HDTV signal(s) from
a central location via satellite or other network communications link and transcode one or
more SDTV signals from respective HDTV signals. Both HDTV and SDTV signals will
then be transmitted to the home.

Although the transcoding equipment will initially be placed in the
transmission chain, requiring channels to be provided for both HDTV and SDTV signals, low
cost ICs will eventually become available to enable the transcoder to migrate to the
consumer's home. The advantage of having the transcoder in the receiving chain is that only the
HDTV signal will actually have to be transmitted (rather than simulcast with the SDTV
signal), and channels otherwise occupied by the SDTV signals can be used for other purposes.

Presently, transcoding from HDTV to SDTV is accomplished by
completely decoding the HDTV signal to form a sequence of high definition images ("HD
image sequence"). The HD image sequence must then be filtered and subsampled to extract
a sequence of lower definition images ("SD image sequence"). The SD image sequence must
then be processed to compute SD macroblock information, for example macroblock type
information, motion vector information and quantizer information, in order to encode it.
However, as is the case with most broadcast quality video systems, a complete encoder is
expensive, and it therefore would not be practical to include one in a cost effective transcoder
designed to be used in the receiving (or transmitting) chain.



An object of the instant invention is, therefore, to provide a method and
apparatus for performing cost effective transcoding which avoids having to compute SD
macroblock information from the SD image sequence.

The instant invention provides a method and apparatus for:

decoding an HDTV signal to provide an HD image sequence and HD macroblock
information pertaining to the coded macroblocks of the HDTV signal, for example picture
type information, macroblock type information, motion vector information and quantizer
information;

filtering and subsampling the HD image sequence to provide an SD image
sequence; and

using the HD macroblock information to directly derive corresponding SD
macroblock information (e.g. picture type information, macroblock type information, motion
vector information and quantizer information) necessary for encoding co-sited SD
macroblocks.

By processing the HD macroblock information directly to derive the SD
macroblock information, the invention avoids the necessity of completely analysing the SD
image sequence in order to derive the SD macroblock information. This simplifies the SD
encoding process and apparatus since it requires much less memory and less computational
complexity than the prior art method, and therefore can be effectively implemented in the
receiving chain.

The preferred embodiments described in this application relate to
transcoding a compressed HDTV signal to a compressed SDTV signal. In general, the same
techniques can be applied to transcoding from any given higher resolution and bit-rate
bitstream to a lower resolution and bit-rate bitstream.

Figure 1 is a block diagram of a preferred embodiment of a transcoder
which implements the invention;

Figure 2 illustrates the relationship between co-sited HD macroblocks and
SD macroblocks; and

Figure 3 illustrates the relationship between co-sited HD macroblock
motion vectors and SD macroblock motion vectors.

The invention transcodes a compressed HD (high definition) video data
bitstream into a compressed SD (lower, e.g. "standard" definition) data bitstream by utilizing
HD macroblock information decoded from the HD bitstream to directly derive SD
macroblock information. As used herein, "directly" means deriving SD macroblock
information from HD macroblock information without having to compute the SD macroblock
information from its corresponding SD image sequence.

The references incorporated herein discuss, for example, the MPEG
digital video protocol and encoders and decoders which can be used to provide both HD and
SD digital signal processing. Details of the operation of digital compression and
coding/decoding operations and equipment are therefore not treated in detail herein.

The HD data bitstream provides coded information (for example
coefficients, quantization scaling information and motion vectors) related to three types of
coded pictures: I (intraframe) coded pictures, P (forward prediction
interframe) coded pictures and B (bidirectional prediction interframe) coded pictures. Each P
picture can include either forward predicted or intraframe coded individual macroblocks.
Each B picture can include either forward predicted, bidirectionally predicted or intraframe
coded individual macroblocks. Each I picture can include I coded macroblocks only. The
HD data bitstream is decoded to re-form the originally encoded HD macroblocks of pixels
(for example 16 x 16 pixels). These macroblocks are then further processed to form an HD
image sequence. As shown in Fig. 1, which describes a preferred embodiment of the
transcoder, the HD information and HD image sequence are derived in HD decoder 10. The
HD image sequence is then filtered and subsampled in subsampler 20 to form an SD image
sequence, which is then sent to partial SD encoder 40.

Unlike the prior art, which further processes the SD image sequence in
order to compute SD macroblock information, the instant invention directly derives SD
macroblock information from HD macroblock information to encode the SD image sequence
provided by subsampler 20 into an SD coded data bitstream in partial SD encoder 40.

HD macroblock information comprises mode information for each group
of four "co-sited" HD macroblocks; this mode information is provided to mode selection
processor 50. The mode information includes:

a) the type of prediction used for each of the macroblocks of the group,
which includes forward predicted, bidirectionally predicted or intraframe coded, i.e. no
prediction (intra), and whether field or frame prediction is used;

b) whether or not each respective macroblock comprises quantizer scale
information;

c) whether or not each respective macroblock comprises residual
coefficient data and, if so, whether it is field or frame DCT coded; and

d) whether or not each respective macroblock comprises motion
information.

Co-sited macroblocks are defined herein as the group of HD macroblocks
forming the portion of an HD picture from which a corresponding SD macroblock of the SD
picture is to be formed. The relationship between co-sited HD macroblocks and SD
macroblocks is shown in Figure 2.

The relationship between each SD macroblock located at the position x,y
within the SD picture and a corresponding portion of the HD picture (of size Sx, Sy) located
at position X,Y within the HD picture is expressed by the equations X=R*x, Y=R*y,
and Sx=Sy=16*R, where R equals a scale factor in each dimension (x and y). For
purposes of this explanation we will assume that the aspect ratios of the HD picture and the
SD picture are the same and that R=2. One SD macroblock (mbl) therefore corresponds to
four HD macroblocks (MB1, MB2, MB3 and MB4).

If the SD picture to be formed from the transcoding process is to have an
aspect ratio which is different, for example 4 by 3, side panels may be used to select the HD
video area so that the same aspect ratio can be preserved. For each macroblock in SD, one
can map it to the corresponding area in HD. All the HD macroblocks fully or partially
covered by this area are used as the co-sited macroblocks of that particular SD macroblock.
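
The mapping from an SD macroblock to its co-sited HD macroblocks can be sketched as follows. This is only an illustration under stated assumptions: positions are taken to be pixel coordinates, and the function name and the `x_offset`/`y_offset` parameters (a stand-in for the side-panel selection) are hypothetical and do not appear in the specification.

```python
import math

MB_SIZE = 16  # macroblock edge in pixels, as in the text


def cosited_hd_macroblocks(x, y, R=2, x_offset=0, y_offset=0):
    """List (column, row) indices of HD macroblocks co-sited with one SD macroblock.

    x, y               -- pixel position of the SD macroblock in the SD picture
    R                  -- scale factor in each dimension
    x_offset, y_offset -- hypothetical shift of the selected HD area, e.g. to
                          skip side panels when the aspect ratios differ
    """
    X = R * x + x_offset   # X = R*x (plus the assumed offset)
    Y = R * y + y_offset   # Y = R*y (plus the assumed offset)
    size = MB_SIZE * R     # Sx = Sy = 16*R

    # Every HD macroblock that the area touches, fully or partially, is included.
    first_col = math.floor(X / MB_SIZE)
    last_col = math.ceil((X + size) / MB_SIZE) - 1
    first_row = math.floor(Y / MB_SIZE)
    last_row = math.ceil((Y + size) / MB_SIZE) - 1

    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]
```

With R=2 and no offset, an aligned SD macroblock maps to exactly four HD macroblocks. A non-zero offset makes the area straddle macroblock boundaries, so partially covered HD macroblocks are returned as well.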

The HD decoder 10 also provides:

a) the number of bits used to code each respective macroblock, to buffer
control and adaptive quantizer 30;

b) quantizer information (q_{HD}) for each macroblock, to buffer control and
adaptive quantizer 30; and

c) motion vector information for each macroblock, to motion estimation
processor 60.

In addition to the above information about individual macroblocks, the
decoder 10 provides information about the type of picture each HD macroblock is part of
(for example I, P or B) to modules 20, 30, 50 and 60 of the transcoder.

Each group of four 16 x 16 pixel HD macroblocks is used to derive a
co-sited 16 x 16 pixel SD macroblock.

The macroblock type of each SD macroblock is determined in mode
selection processor 50 based upon mode information from co-sited HD macroblocks.
The macroblock mode relationship can be represented in the following
general terms:

t(mbl) = G[T1(MB1), T2(MB2), T3(MB3), ..., Tn(MBn)]

where G is a function or operation, n is the total number of the co-sited HD macroblocks, t
is the mode to be assigned to SD macroblock mbl, and T1 to Tn are the respective modes of
the co-sited macroblocks.

The SD macroblock mode can be determined, for example, by a process
G which determines the respective macroblock mode of each HD macroblock in the group
of HD macroblocks MB1 to MBn and keeps count of the number of times each particular
mode is used in order to determine which mode is used most often in the group. The mode
used most often in the group of HD macroblocks is then assigned to SD macroblock mbl.
Likewise, the type of DCT coding most often used to code the residual data among the
co-sited macroblocks MB1 to MBn is used to determine the type of DCT coding to be
used in SD macroblock mbl.

In case there is no mode which represents a plurality (i.e. in case of a
"tie"), the priority list in Table I can be applied to determine the prediction type for a particular
SD macroblock.

Table I is based on heuristics intended to maximize the overall coding
performance for the SD video.

For each HD picture, the corresponding SD picture will have the same
picture type (I, P or B). Table I is organized according to the picture type. The possible
macroblock categories which can be assigned to the SD macroblock are listed with the
highest priority at the top and the lowest priority at the bottom of the column. The use of
Table I can be illustrated by the following example.

If the HD co-sited macroblocks are part of a P picture, then the
corresponding SD macroblock will be determined for a P picture (column with the heading
P). If there is a "tie" between two intra coded co-sited macroblocks and two field predicted
macroblocks, the SD macroblock would be intra coded since, of the two categories of HD
macroblock, intra has the higher priority.

For each picture type, DCT coding type (i.e. frame or field based) is also 
determined by plurality. In case of a "tie", field DCT is selected for the SD macroblock. 
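
The plurality rule with a priority-based tie-break can be sketched as below. Table I is not reproduced in this text, so the `P_PRIORITY` list is a placeholder: only the ordering of intra ahead of field prediction comes from the example above, and the remaining entry and all names are assumptions made for illustration.

```python
from collections import Counter


def select_by_plurality(values, priority):
    """Return the most frequent value; break ties with `priority` (highest first).

    Every tied value must appear in `priority`, otherwise ValueError is raised.
    """
    counts = Counter(values)
    best = max(counts.values())
    tied = [value for value in counts if counts[value] == best]
    if len(tied) == 1:
        return tied[0]
    # Tie: pick the tied value that appears earliest in the priority list.
    return min(tied, key=priority.index)


# Placeholder column for P pictures (assumed; Table I is not available here).
P_PRIORITY = ["intra", "field_forward", "frame_forward"]

# DCT type: the text states that field DCT is selected on a tie.
DCT_PRIORITY = ["field", "frame"]
```

For the example in the text, two intra and two field predicted macroblocks in a P picture yield an intra SD macroblock, and two field plus two frame DCT coded macroblocks yield field DCT.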

After the SD macroblock mode is selected, the motion vectors of the SD
macroblock can be determined. Motion compensation is always performed on each SD
macroblock in SD encoder 40 based on its SD macroblock mode and its motion vectors.
Along with the respective derived SD macroblock type, the motion vectors for each group of
HD macroblocks are used to determine the motion vectors for a co-sited SD macroblock.

For an intra SD macroblock, no motion vectors are used. For a
forward predicted SD macroblock, the forward frame or field motion vector of mbl is a function
of the forward frame and field motion vectors for the group of HD macroblocks MB1 to MB4,
as shown in Fig. 3 and explained in more detail below.

For a bidirectionally predicted SD macroblock, the forward frame or field
motion vector of mbl is a function of the forward frame and field motion vectors for MB1 to
MB4. The backward frame or field motion vector of mbl is a function of the backward
frame and field motion vectors for MB1 to MB4. Once the initial estimates of the motion
vectors for mbl have been determined, additional motion estimation using these motion
vectors as offsets may be carried out for further refinement.

The motion compensation is then performed by the SD encoder 40 to
re-calculate the residues.

Initial motion vectors for SD macroblock mbl are estimated in motion
estimation processor 60 using the HD motion vectors supplied by HD decoder 10 by the
methods described below.

The initial estimate of the motion vector for mbl can be determined, for
example, by dividing the average of the motion vectors of HD macroblocks MB1 to MBn in
each direction (forward/backward) by R. In other words, from MB1 to MBn, the motion
vectors belonging to the same direction (forward/backward), regardless of their structures
(field/frame), should be averaged.
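
A minimal sketch of this averaging step follows. It assumes the vectors passed in all belong to one direction and that any field/frame conversion has already been applied to them; the function name is hypothetical, and rounding to the encoder's motion vector precision is left out because the text does not specify it.

```python
def initial_motion_vector(hd_vectors, R=2):
    """Average same-direction HD motion vectors (dx, dy) and divide by R.

    Returns None when no co-sited HD macroblock carries a vector in this
    direction (for example, when all of them are intra coded).
    """
    if not hd_vectors:
        return None
    count = len(hd_vectors)
    average_dx = sum(dx for dx, _ in hd_vectors) / count
    average_dy = sum(dy for _, dy in hd_vectors) / count
    # Scale from HD picture coordinates down to SD picture coordinates.
    return (average_dx / R, average_dy / R)
```

The result is only a starting point for the refinement search, so it is returned unrounded here.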

If the required estimate of the initial motion vector is frame based, then
all the HD field motion vectors are converted to the corresponding HD frame motion
vectors, in the motion estimation processor 60, by dividing the vertical component of the
respective field motion vector by two before averaging. If the required estimate of the initial
motion vector is field based, then all the frame motion vectors are converted to the
corresponding field motion vectors before averaging by multiplying the vertical component of
these frame motion vectors by two.

Since it is very likely that the co-sited HD macroblocks in each group
have different types and values of motion vectors, the initial motion vectors for the co-sited
SD macroblock derived directly from these HD motion vectors may not be very accurate.
Additional motion estimation is therefore required as a refinement. Given a good initial
estimate from the co-sited HD motion vectors, the amount of motion estimation needed is
still much less than that required by a complete SD encoder.

In order to allow constant rate transmission of bursty (compressed) data,
buffers are needed at the encoder and the decoder. Consider the broadcast scenario, where
there will be one encoder buffer and innumerable decoder buffers (with information
transmitted only in one direction: from the encoder to the decoder). The encoder must
ensure that none of the decoder buffers either overflow or underflow. MPEG addresses this
problem by having the encoder generate a video buffer status signal (vbv_delay). The
encoder transmits the vbv_delay for every picture (also referred to as a frame) in order to
inform the decoder of the state in which its buffer should be before the start of decoding of
the current picture.

Since the transcoder generates an SD bitstream from an HD bitstream, it
needs to make sure that the SD bitstream satisfies the constraints imposed by the need for
video buffer control in the SD decoder. In other words, since the bit-rate and buffer size
change for the SD bitstream, the video buffer control information of the HD bitstream (HD
vbv_delay) will have to be modified to appropriate video buffer control information for the
transcoded SD bitstream (SD vbv_delay). This is achieved in buffer control and adaptive
quantizer 30 in the following manner.

The HD encoder ensures that the video buffer conditions are satisfied by
providing an HD vbv_delay signal, as taught in the references incorporated herein.
Mathematically, the requirement of the HD video buffer control information is:

0 < O_{HD} < B_{HD}

where O_{HD} is the occupancy of a video buffer of a hypothetical decoder coupled to an HD
encoder, immediately before and immediately after decoding a frame, and B_{HD} is the size
of the buffer.

The corresponding requirement of the SD video buffer control information
is, similarly:

0 < O_{SD} < B_{SD}

where O_{SD} is the occupancy of a video buffer of a hypothetical decoder coupled to an SD
encoder, immediately before and immediately after decoding a frame, and B_{SD} is the size of
the buffer. This requirement can be satisfied by using buffer control and adaptive quantizer
30 in the transcoder, which comprises a buffer controller that takes advantage of the fact
that, if the relationship between buffer occupancy and buffer size stated above holds for the
HD bitstream, the SD requirement is met by imposing the following restriction on O_{SD}:

O_{SD} = (B_{SD} / B_{HD}) O_{HD}.
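
The reason this restriction is sufficient can be written out in one step, assuming both buffer sizes are positive so that multiplying by their ratio preserves the inequalities:

```latex
0 < O_{HD} < B_{HD}
\;\Longrightarrow\;
0 < \frac{B_{SD}}{B_{HD}}\,O_{HD} < \frac{B_{SD}}{B_{HD}}\,B_{HD}
\;\Longrightarrow\;
0 < O_{SD} < B_{SD}
```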

The buffer control and adaptive quantizer 30 receives the number of bits
used for encoding each HD macroblock and computes the number of bits used to encode
each HD picture. The number of bits for the corresponding SD picture is then estimated
using the formula:

BITS_SD_i = (B_{SD} / B_{HD}) BITS_HD_i

where BITS_SD_i is the estimated number of bits to be used to code a corresponding SD
picture i and BITS_HD_i is the actual number of bits used to code the corresponding HD
picture i.
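
As a rough sketch of this step, the per-macroblock bit counts of one HD picture can be summed and scaled by the ratio of the buffer sizes. The function and parameter names are illustrative assumptions, not taken from the specification.

```python
def estimate_sd_picture_bits(hd_macroblock_bits, sd_buffer_size, hd_buffer_size):
    """Estimate the bit target of an SD picture from its co-sited HD picture.

    hd_macroblock_bits -- bits used by each coded macroblock of the HD picture
    sd_buffer_size     -- size of the SD video buffer
    hd_buffer_size     -- size of the HD video buffer
    """
    bits_hd = sum(hd_macroblock_bits)                   # bits of the whole HD picture
    return bits_hd * sd_buffer_size / hd_buffer_size    # scaled SD estimate
```

For instance, an HD picture of 6000 bits with an SD buffer one quarter the size of the HD buffer gives a target of 1500 bits.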

It should be noted that this requires that the control over the buffer
occupancy be tight. In other words, the actual coded bits per picture must be close to the
estimated number of target bits.

After BITS_SD_i is computed for the current SD picture, an average
quantizer scale (Q_i^{SD}) for the SD picture which would result in BITS_SD_i is computed in
buffer control and adaptive quantizer 30 as follows.

The complexity of the SD picture (C_i) is represented by the formula

C_i = BITS_SD_i * Q_i^{SD}

and similarly, for the previous SD picture,

C_{i-1} = BITS_SD_{i-1} * Q_{i-1}^{SD}.

In order to achieve continuity in quality from picture to picture, C_i should be equal to
C_{i-1}. Therefore, solving for the estimated average quantizer step size, we get:

Q_i^{SD} = (BITS_SD_{i-1} * Q_{i-1}^{SD}) / BITS_SD_i.

The quantizer step size values for each of the HD macroblocks of the co-sited HD picture
(q_{HD}) are provided by HD decoder 10 to buffer control and adaptive quantizer 30. Buffer
control and adaptive quantizer 30 calculates the average of the q_{HD} values for the HD
picture in order to provide an average quantizer scale, Q_{HD}, for the current HD picture.
Buffer control and adaptive quantizer 30 can also calculate Q_i^{SD} since Q_{i-1}^{SD} and
BITS_SD_{i-1} are retained in its buffer. At the beginning of the transcoding process, before a
first value of Q_{i-1}^{SD} is available, Q_{HD} can be used in place of Q_{i-1}^{SD}.
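
The quantizer estimate can be sketched as below. The names are hypothetical, and the fallback argument models the start-up case in which the HD average stands in for the previous SD quantizer; how the previous SD bit count is obtained for the very first picture is not stated in the text, so the caller is assumed to supply it.

```python
def average_hd_quantizer(q_hd_values):
    """Mean of the per-macroblock quantizer step sizes of one HD picture."""
    return sum(q_hd_values) / len(q_hd_values)


def estimate_sd_quantizer(bits_sd_current, bits_sd_previous,
                          q_sd_previous=None, q_hd_average=None):
    """Solve BITS_current * Q_current = BITS_previous * Q_previous for Q_current.

    If no previous SD quantizer exists yet, the HD average is used in its place.
    """
    q_reference = q_sd_previous if q_sd_previous is not None else q_hd_average
    if q_reference is None:
        raise ValueError("need q_sd_previous or q_hd_average")
    return bits_sd_previous * q_reference / bits_sd_current
```

Halving the bit target relative to the previous picture doubles the estimated quantizer scale, which is the behaviour the equal-complexity assumption implies.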

The bits available to code an SD picture must be allocated in such a
manner as to increase its subjective quality without increasing the number of bits used to
encode it. This is achieved by allowing more distortion in those areas of the image which are
complex (where it is less visible) while quantizing finely those areas that are sensitive to
noise (for instance, flat and low activity areas).

The quantization step size q_{SD} for each SD macroblock, which determines
the distortion introduced in the macroblock and determines the number of bits generated by
it, is computed by buffer control and adaptive quantizer 30 from three factors: (1) the
estimated average quantizer step size, Q_i^{SD}, (2) the buffer status, O_{SD,i} (which is obtained by
reading the current occupancy of the buffer in buffer control and adaptive quantizer 30), and
(3) the relative complexity of each SD macroblock with respect to the other macroblocks of
its SD picture.

This is described by the following equation:

q_{SD} = f(Q_i^{SD}, O_{SD,i}, c_j)

where the average complexity of a macroblock, c_j, is a function of the following quantities
obtained from the HD coded bitstream:

c_j = f(q_{HD1}, q_{HD2}, q_{HD3}, q_{HD4}, ..., q_{HDn}; b1, b2, b3, b4, ..., bn)

where b is the number of bits for each HD macroblock. For example:

c_j = minimum value from among the following products:
(b1 * q_{HD1}); (b2 * q_{HD2}); (b3 * q_{HD3}); (b4 * q_{HD4}).

The rationale for the above procedure is that coding the SD macroblock based on the
HD macroblock that is most sensitive to noise ensures that the "worst" case is taken care of.
For the same quantization step, a smaller number of bits means lower activity and thus
greater sensitivity to noise.
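
The example complexity measure, the minimum of the bits-times-quantizer products over the co-sited HD macroblocks, can be sketched as follows. The function name is an assumption, and the way c_j is later combined with the average quantizer and buffer status is not specified in the text, so it is not modelled here.

```python
def macroblock_complexity(bits, q_hd):
    """Return the smallest b_k * q_HDk over the co-sited HD macroblocks.

    bits -- number of bits used by each co-sited HD macroblock
    q_hd -- quantizer step size of each co-sited HD macroblock
    """
    if not bits or len(bits) != len(q_hd):
        raise ValueError("bits and q_hd must be non-empty and of equal length")
    # The minimum product identifies the macroblock most sensitive to noise.
    return min(b * q for b, q in zip(bits, q_hd))
```

For bit counts (120, 80, 200, 150) and step sizes (8, 10, 6, 8) the products are 960, 800, 1200 and 1200, so the second macroblock sets the complexity.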

While the invention has been described in its preferred embodiment, it is
to be understood that the words which have been used are words of description rather than of
limitation and that changes within the purview of the appended claims may be made without
departing from the true scope and spirit of the invention in its broader aspects.

Claims:



1. A method for transcoding a first group of macroblocks of a first digital
television signal to a co-sited macroblock of a second digital television signal, said method
comprising the steps of:

a) deriving from each of said first group of macroblocks corresponding
HD macroblock information; and

b) deriving SD macroblock information for said co-sited macroblock
directly from said HD macroblock information so as to form said second digital television
signal.

2. The method of claim 1 wherein said HD macroblock information and said
SD macroblock information comprise prediction information, quantizer scale information,
motion information and transform coefficient information.

3. An apparatus for transcoding a first group of macroblocks of a first digital
television signal to a co-sited macroblock of a second digital television signal, said apparatus
comprising:

a) means for deriving from each of said first group of macroblocks
corresponding HD macroblock information; and

b) means for deriving SD macroblock information for said co-sited
macroblock directly from said HD macroblock information so as to form said second digital
television signal.

4. The apparatus of claim 3 wherein said HD macroblock information and
said SD macroblock information comprise prediction information, quantizer scale
information, motion information and transform coefficient information.